INTELLIGENTLY INTERACTIVE PROFILING SYSTEM AND METHOD

CROSS-REFERENCE TO RELATED APPLICATIONS 
[0001] This application claims the benefit of U.S. Provisional Patent
Application No. 60/410,905, filed September 13, 2002, titled "An Intelligently
Interactive Profiling System", which is incorporated herein by this reference.

BACKGROUND 

1. Technical Field 

[0002] The present invention relates to identifying at least one property of
data. More particularly, the invention concerns an intelligently interactive system for
identifying at least one property of data.

2. Description of Related Art 

[0003] It is frequently useful to profile data. For example, data may be
profiled to determine the expected risk of fraud in a credit card transaction, the risk
of terrorism that a freight shipment poses, or the risk that a patient has a serious
medical condition. Data profiling can also be applied to ascertain the chances that a
viewer will enjoy a movie, the chances that a person will be compatible with another
person in a dating service database, or the chances that a stock will go up or down as
the result of a set of economic conditions.

[0004] Known methods of profiling data may involve applying behavioral
rules prescribed by human experts to the data. As an example, a behavioral rule could
assign a high level of risk of fraud to a credit card transaction if the credit card used for
the transaction has been reported lost. As another example, a behavioral rule could
assign a high level of terrorism risk to a shipping container if a high level of
radioactivity is measured outside the container.

[0005] One shortcoming of using only behavioral rules prescribed by human
experts for profiling data is that the experts may have insufficient knowledge to
prescribe rules. Other shortcomings of using only rules prescribed by human experts
are that the experts may erroneously prescribe incorrect rules, or may prescribe
conflicting rules. Another shortcoming is that humans typically cannot quickly develop
rules, and are slow to react when there is a need to change rules. Yet another
shortcoming is that over time, the number of rules prescribed by human experts may
grow very large and may require a long time to process, which could result in the
profiling being too slow for many applications. For example, a method for profiling
data to determine the risk of fraud for a credit card transaction must be able to be
completed within several seconds in order to be useful for many applications. Another
shortcoming of using only rules prescribed by human experts is that some of the rules
may be difficult or impossible to implement. Existing methods for profiling data have
additional shortcomings, such as not having an automatic feedback loop for improving
the rules prescribed by human experts, and not being reactive or proactive to the user.

[0006] Existing methods for profiling data that utilize machine learning in the
form of neural networks merely function as black boxes that produce an output, and
also are not reactive or proactive to the user. The lack of user feedback in these
methods limits the accuracy of the results, and limits the capability of these methods to
adapt to changed circumstances or to correct errors. Further, existing methods that
utilize machine learning rely excessively on supervised learning, which may limit the
accuracy and usefulness of the results in cases where feedback is limited or nonexistent.

[0007] In summary, existing methods for profiling data are inadequate for 
many applications. 

SUMMARY

[0008] One aspect of the present invention concerns a method for identifying
at least one property of data. An example of the method includes the operations of
receiving data, and making assessments regarding the data. The method also includes
applying at least one behavioral operator, and outputting results. The method further
comprises receiving feedback concerning system performance. Additionally, the
method includes adjusting at least one parameter based on the feedback received
concerning system performance, wherein the at least one parameter is a parameter of a
machine learning method.

[0009] Other aspects of the invention are described in the sections below, and
include, for example, a profiling system, and a signal-bearing medium tangibly
embodying a program of machine-readable instructions executable by a digital
processing apparatus to perform a method for identifying at least one property of data.

[0010] The invention provides a number of advantages. For example, some
examples of the invention advantageously adjust at least one parameter of a machine
learning method, based on feedback received from a user. The invention also provides
a number of other advantages and benefits, which should be apparent from the
following description.

BRIEF DESCRIPTION OF THE DRAWINGS
[0011] FIG. 1 is a block diagram of the hardware components and
interconnections of a system for identifying at least one property of data, in accordance
with an example of the invention.

[0012] FIG. 2 is an example of a signal-bearing medium in accordance with an
example of the invention.

[0013] FIG. 3 is a block diagram illustrating interactions between functional
elements of a system for identifying at least one property of data, and interactions
between the system and a user, and data sources, in accordance with an example of the
invention.

[0014] FIGS. 4A-C are a flowchart of an operational sequence for identifying
at least one property of data in accordance with an example of the invention.

[0015] FIG. 5 is a depiction of a display of an output showing a plurality of
membership functions and an indicator in accordance with an example of the invention.

DETAILED DESCRIPTION
[0016] The nature, objectives, and advantages of the invention will become
more apparent to those skilled in the art after considering the following detailed
description in connection with the accompanying drawings.

I. HARDWARE COMPONENTS AND INTERCONNECTIONS
[0017] One aspect of the invention is a system for identifying at least one
property of data. The at least one property of the data may be associated with profiling
the data. As an example, the system may be embodied by the hardware components
and interconnections of the computing system shown in FIG. 1, which will be referred
to as the profiling system 100.

[0018] The profiling system 100 includes a processor 104 coupled to a storage
106. The storage 106 includes a memory 108 and a nonvolatile memory 110. As an
example, the memory 108 may be RAM. The nonvolatile memory 110 may comprise,
for example, one or more magnetic data storage disks such as a hard drive, an optical
drive, a tape drive, or any other suitable storage device. The storage 106 may store
programming instructions executed by the processor 104. The profiling system 100
also includes an output 112, and may also include a display 114 that is coupled to the
output 112. A keyboard 116 may also be included in the profiling system 100. The
profiling system 100 also includes an input/output 118, such as a line, bus, cable,
electromagnetic link, or other means for the profiling system 100 to send or receive
data from external to the profiling system 100. As an example, data may be inputted to
the profiling system 100 via the input/output 118.

[0019] The profiling system 100 may be implemented by any suitable
computing apparatus, such as, for example, a super computer, a mainframe computer, a
computer workstation, a personal computer, a cluster of computing devices, or a grid of
computing devices connected over a LAN or WAN. Classified or proprietary data
should be stored on a secure machine, and in the case of a cluster or grid, should be
stored on a secure network. In one example, the profiling system 100 is a personal
computer with an Intel processor running the Windows operating system, having the
maximum available computing power and data access rates, and having the capability
to back up data.

[0020] The profiling system may be implemented in a machine of different
construction than the profiling system 100 described above, without departing from the
scope of the invention. As an example, the memory 108 or the nonvolatile memory
110 may be eliminated, or the storage 106 could be provided on-board the processor
104, or the storage 106 could be provided remotely from the processor 104.

II. OPERATION

[0021] In addition to the various hardware embodiments described above,
another aspect of the invention concerns a method for identifying at least one property
of data.

A. Signal-Bearing Media 

[0022] In the context of FIG. 1, the method may be implemented, for
example, by operating the profiling system 100 to execute a sequence of
machine-readable instructions, which can also be referred to as code. These instructions
may reside in various types of signal-bearing media. In this respect, one aspect of the
present invention concerns a programmed product, comprising a signal-bearing
medium or signal-bearing media tangibly embodying a program of machine-readable
instructions executable by a digital processing apparatus to perform a method for
identifying at least one property of data.

[0023] This signal-bearing medium may comprise, for example, the memory
108 or the nonvolatile memory 110. Alternatively, the instructions may be embodied in
a signal-bearing medium such as the optical data storage disc 200 shown in FIG. 2.
The optical disc could be any type of signal-bearing disc, for example, a CD-ROM,
CD-R, CD-RW, WORM, DVD-R, DVD+R, DVD-RW, or DVD+RW. Whether
contained in the profiling system 100 or elsewhere, the instructions may be stored on
any of a variety of machine-readable data storage mediums or media, which may
include, for example, direct access storage (such as a conventional "hard drive", a
RAID array, or a RAMAC), a magnetic data storage diskette (such as a floppy disk),
magnetic tape, digital optical tape, RAM, ROM, EPROM, EEPROM, flash memory,
magneto-optical storage, paper punch cards, or any other suitable signal-bearing media,
including transmission media such as digital and/or analog communications links,
which may be electrical, optical, and/or wireless. As an example, the machine-readable
instructions may comprise software object code, compiled from a language such as
"C++".

B. Operational Modules of Profiling System

[0024] FIG. 3 is a block diagram illustrating interactions between functional
elements of a profiling system 300, and interactions between the profiling system 300
and a user 302 and data sources 306, 308, 310. The profiling system may, for example,
perform a method for identifying at least one property of data. The profiling system
300 may be called an intelligently interactive profiling system.

[0025] As an example, the profiling system 300 may be implemented by a
software program that may run on the profiling system 100 described above. The
program may be embodied in signal-bearing media, as discussed above. The profiling
system 300 includes an interface/control module 304 that may receive input from the
user 302 and present output to the user 302. The interface/control module 304 also may
receive data from one or more data sources including a commercial data source 306, a
government data source 308, and/or other data sources 310. The interface/control
module 304 is coupled to a data integrity module 312, which examines the integrity of
the data received by the interface/control module 304. The interface/control module
304 is also coupled to a behavioral operators module 314, which generates and
evaluates behavioral operators used by the profiling system 300. The behavioral
operators module 314 may be further configured to adjust one or more behavioral
operators, and to add new behavioral operators, based on feedback received regarding
the outputted results. An anomaly detection module 316, for detecting anomalies in the
data, is also coupled to the interface/control module 304. The interface/control module
304 is also coupled to a machine learning module 318, which performs machine
learning, such as supervised and/or unsupervised learning, which may be performed
during the process of identifying at least one property of data (which in some examples
may be a process for interactive profiling). The interface/control module 304 may
receive feedback regarding outputted results from the user 302. The interface/control
module 304 may also receive feedback concerning system performance from the user
302, and may adjust parameters based on the feedback received concerning system
performance. The interface/control module 304 may also proactively generate at least
one suggestion and output the at least one suggestion to the user 302, and solicit
feedback from the user 302 concerning the generated suggestions. The
interface/control module 304 may also receive feedback concerning the generated
suggestions, and interpret the feedback received concerning the generated suggestions.
The operation of the profiling system 300 is discussed further below.

C. Overall Sequence of Operation 
[0026] An example of the method aspect of the invention is a method for
identifying at least one property of an item(s), individual(s), or other data. The
properties identified may be desired properties and/or undesired properties. In some
embodiments, the method may be referred to as a method for profiling multiple items
and/or individuals and outputting results regarding the items and/or individuals, in an
intuitive, user-friendly manner. A user may first identify a problem, which may be a
recognized deficiency between a current state and a desired state, and a performance
measure may be constructed to capture the preferences of outcomes. In general, the
user 302 may query the profiling system 100 with respect to a particular item or
individual, or possibly multiple items and/or individuals. The user 302 may access the
profiling system 100 via the keyboard 116, or via an external terminal, remote dial-in,
over the Internet, or via other means of access.

[0027] The profiling system 100 may respond by identifying one or more
characteristics of the item or individual, such as the terrorism risk posed by the item
(such as a shipping container) or the chances that the user 302 would enjoy dating the
individual. The following are examples of properties of items, individuals, or other
data, that may be identified by the method aspect of the invention: the (terrorism) risk
presented by a shipment; the (terrorism) risk presented by a shipping container; the risk
that an individual is a terrorist; the risk of fraud associated with a credit transaction; the
risk that cancer (or that another disease) is present; the chances that a person will enjoy
dating another person; the chances that a person will enjoy a particular movie; etc.
Additional examples include: evaluating (assessing) a restaurant to determine whether
or not someone would like to eat there; evaluating property (e.g., a house, a car) to
determine whether or not someone would like to buy or sell it; evaluating customer
data to determine which customers would be better selections for marketing alternative
products; evaluating applications for a job or entrance to a university for suitability;
evaluating records of financial filings to determine which records may contain fraud or
errors; evaluating candidates for drugs to determine which may be particularly
appropriate for addressing a chosen disease; evaluating alternative investments to
determine which are appropriate or inappropriate for an investor; evaluating sporting
equipment to determine which among a range of possible choices is best for the player;
evaluating alternative vacation destinations to determine which the user would be most
or least likely to enjoy; evaluating employee performance to determine if a promotion
is appropriate; and evaluating data pertinent to the health of equipment and predicting
the onset of failures.

[0028] An example of the method aspect of the present invention is illustrated
in FIGS. 4A-C, which show a sequence 400 for a method for identifying at least one
property of an item(s), individual(s), or other data. For ease of explanation, but without
any intended limitation, the example of FIGS. 4A-C is described in the context of the
profiling system 100 and the profiling system 300 described above. In one example,
the operations of the sequence 400 are performed in the order that they appear in FIGS.
4A-C. However, it will be apparent to persons skilled in the art that the order of
performance of the operations of the sequence 400 in many cases may be different than
the order in FIGS. 4A-C.

[0029] Referring to FIG. 4A, the sequence 400, which may be performed by
the profiling system 100, may begin with operation 404, which comprises receiving
data, which may also be called gathering the data. The data may be received from the
commercial data source 306, the government data source 308, or from other data
sources 310. The received data may be, for example, a historical set of credit card
transactions or a shipment history of a shipping company, which possibly could be
obtained from commercial transaction data, and could also be local, state, federal or
foreign government data or data from the United Nations. Data could also be obtained
by making direct measurements. The profiling system 100 may input data once,
repeatedly, or continuously. To receive the data, the profiling system 100 may access
databases such as the commercial data source 306, the government data source 308,
and/or other data sources 310, which normally may contain data regarding the item(s)
and/or individual(s). In a shipping container profiling embodiment, the inputted data
may include, for example, facts involving prior transactions, transportation of goods,
responsible parties, criminal records, known associates, and other data that may be
pertinent to evaluating the item(s) and/or individual(s) with respect to the purpose of
the user 302. In some examples the profiling system may interrogate the available
databases with respect to diverse methods of evaluating the profile desired. The
inputted data may be stored in the storage 106. However, the inputted data does not
have to be stored in a single storage device or location.

[0030] The sequence 400 may also include operation 406, which comprises
making assessments regarding features and/or the data. The operation of making
assessments regarding features and the data in many instances may include checking
the integrity of the data. The data are typically assessed with respect to issues such as:
- completeness (determining if any data are missing, and how many data are missing);
- reliability (determining if the data were collected at the proper time, and determining if
the correct data are being collected);
- precision (determining how precise the data are, and determining if measurements
are being made with sufficient resolution; for example, determining if measurements to
the nearest meter are being taken when measurements to the nearest centimeter are
needed);
- accuracy (determining how accurate the data are, determining if the data are often
in error, and determining why the data are in error; for example, determining if there is
noise that is inherent to the measurements, determining if there is noise in the
system, or in the sensors, or both, and determining how much human error is involved).

[0031] Assessing the data in the shipping container profiling embodiment may
include, for example, verifying facts regarding the item(s) and/or individual(s) in
question, such as matching a vehicle identification number with a particular make of
car, or a passport number with a particular individual. As another example of checking
data integrity, in an embodiment for determining which movies a user 302 may like,
information regarding how much money a particular movie made could be gathered
from at least two sources, and the information could be compared to check for
consistency.
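
As a rough illustration of the integrity checks described in operation 406, the following Python sketch computes a completeness figure, a crude precision check, and a two-source consistency check. The names IntegrityReport and assess_integrity, the gap-based resolution heuristic, and the relative tolerance are all assumptions made for illustration; the specification does not prescribe any particular implementation.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntegrityReport:
    missing_count: int        # completeness: how many values are absent
    completeness: float       # fraction of values that are present
    resolution_ok: bool       # precision: finest observed step is fine enough
    sources_consistent: bool  # two sources agree within a tolerance


def assess_integrity(values: Sequence[Optional[float]],
                     required_resolution: float,
                     second_source: Sequence[Optional[float]],
                     relative_tolerance: float) -> IntegrityReport:
    """Hypothetical integrity assessment of one numeric field."""
    present = [v for v in values if v is not None]
    missing_count = len(values) - len(present)
    completeness = len(present) / len(values) if values else 0.0

    # Precision heuristic (an assumption): the smallest gap between distinct
    # values approximates the measurement resolution. With fewer than two
    # distinct values the resolution cannot be confirmed, so report False.
    distinct = sorted(set(present))
    gaps = [b - a for a, b in zip(distinct, distinct[1:])]
    resolution_ok = bool(gaps) and min(gaps) <= required_resolution

    # Consistency: compare entries that are present in both sources.
    pairs = [(a, b) for a, b in zip(values, second_source)
             if a is not None and b is not None]
    sources_consistent = all(
        math.isclose(a, b, rel_tol=relative_tolerance) for a, b in pairs)

    return IntegrityReport(missing_count, completeness,
                           resolution_ok, sources_consistent)
```

In this sketch, measurements recorded only to the nearest meter would fail a required resolution of one centimeter, and box-office figures taken from two sources would be flagged when they disagree by more than the chosen tolerance.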

[0032] Assessing features typically involves discovering features, as well as
determining the utility of the features. Discovering features is a process of noting
repeated patterns over examples, which could be time-dependent or static. For example,
one feature of the moon is that the bright side always points to the sun. The utility of a
feature comes in how it is used to accomplish some task. For example, once it is known
that the bright side of the moon always faces the sun, a relationship between the moon
and the sun can be imputed to try to understand the dynamics of how that feature could
arise. Features may be mined from data by looking for patterns that are either repeated
simply as patterns themselves, or as patterns associated with events, such as a seasonal
variation in sales for clothing.
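
One simple way to look for a repeated, time-dependent pattern such as a seasonal variation in sales is sketched below in Python. It is only an illustrative heuristic under stated assumptions: the function name seasonal_strength and the choice of measure (the share of total variance explained by position within the period) are not taken from the specification.

```python
from statistics import mean, pvariance
from typing import Sequence


def seasonal_strength(series: Sequence[float], period: int = 12) -> float:
    """Fraction of total variance explained by the position within the period.

    Values near 1.0 suggest a strongly repeating pattern of the given period;
    values near 0.0 suggest that no such pattern is present.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")
    total = pvariance(series)
    if total == 0:
        return 0.0
    grand_mean = mean(series)
    between = 0.0
    for phase in range(period):
        # All observations that fall at the same position in each period.
        group = series[phase::period]
        between += len(group) * (mean(group) - grand_mean) ** 2
    return (between / len(series)) / total
```

A feature discovered this way, for example "sales peak in the same month each year", could then be assessed for utility by checking whether it helps with the profiling task at hand.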

[0033] The interface may allow the user 302 to make adjustments to the
profiling system 100 to incorporate the knowledge of the user 302. Thus, the sequence 400
may also include operation 408, which comprises receiving input from the user 302
concerning the user's knowledge. For example, the user 302 may know that a certain
credit card has been stolen but this fact has not yet been reported to commercial data
warehouses. Based on this knowledge, the user 302 could enter a rule identifying the
status of the credit card number with an appropriate level of profiling assessment.
(Profiling assessment is different than assessing the data for data integrity.) As another
example, the user 302 could have knowledge of the perceived risk associated with
various countries or organizations in the world, which could serve to influence the
perceived risk associated with shipments of goods from those countries, or goods that
have been transported through those countries.

[0034] The sequence 400 may also include operation 410, which comprises
applying at least one behavioral operator. Behavioral operators may include behavioral
rules and/or suitable mathematical constructs (such as a neural network). The
behavioral operators represent conditions and/or behavior that are of interest to the user
302, and are usually based on features of the data.

[0035] Applying behavioral rules involves developing explicit rules pertaining
to features, and to associated patterns of behavior. The behavioral rules are typically
conclusions or actions to take (behavior = stimulus-response pair), based on conditions
detected. For example, if someone smiles and then shakes your hand, and then smiles,
and then shakes your hand, you might develop the rule IF [Person Smiles] THEN
[extend hand]. This is a behavior rule, applied to features detected based on sensed
data. Another example of a behavioral rule is: If data indicate that a shipping container
was sent from a known criminal, then increase the perceived risk associated with the
container. The following is another example of a behavioral rule: If data indicate that
a potential immigrant has traveled to countries designated to be of concern to the
government, then increase the perceived risk associated with the individual. Yet
another example of a behavioral rule is: If data indicate that a carrier has a record of
significant violations of laws in prior shipments, then increase the perceived risk
associated with the carrier and/or item. One example of a behavioral operator, other
than a behavioral rule, is a neural network, which in a gaming example, may "profile" a
checkerboard based on input features to determine the favorability of the particular
position of checkers on the board.
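
The IF/THEN behavioral rules above can be sketched in Python as condition/response pairs applied to a record. The class BehavioralRule, the function apply_rules, the record field names, and the numeric risk increments are hypothetical and chosen only for illustration; the specification does not state how rules are encoded or how much each rule changes the perceived risk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


@dataclass
class BehavioralRule:
    name: str
    condition: Callable[[Record], bool]  # the "IF" part (stimulus)
    risk_increase: float                 # the "THEN" part (response)


def apply_rules(record: Record, rules: List[BehavioralRule],
                base_risk: float = 0.0) -> Tuple[float, List[str]]:
    """Return the perceived risk (capped at 1.0) and the names of fired rules."""
    risk = base_risk
    fired: List[str] = []
    for rule in rules:
        if rule.condition(record):
            risk += rule.risk_increase
            fired.append(rule.name)
    return min(risk, 1.0), fired


# Illustrative rules modeled on the shipping examples in the text.
EXAMPLE_RULES = [
    BehavioralRule("sender_known_criminal",
                   lambda r: bool(r.get("sender_is_known_criminal")), 0.4),
    BehavioralRule("carrier_prior_violations",
                   lambda r: int(r.get("carrier_violations", 0)) > 0, 0.3),
]
```

Returning the list of fired rules alongside the score keeps the rationale available for reporting, so that a user can see why a particular assessment was reached.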

[0036] The sequence 400 may also include operation 412, which comprises
outputting results, so that the results can be displayed for the user 302 (which can also
be called reporting the results). The results are the results of applying at least one
behavioral operator to the data. The behavioral rules may also be outputted. The
behavioral rules may be reported according to user preferences, for example, with pros
and cons sorted separately. The results of the profiling system 100 may be presented to
the user 302 with a graphical user interface. The results presented on the interface may
indicate overall profiles, level of risk assessed (possibly with associated color coding
representing various conditions), as well as a report on the rationale of the profiling
system's evaluation so that the user 302 can understand why the system has reached the
decision that it reached and what information might be required to make an improved
evaluation. As an example, the operation of reporting results may comprise outputting
information which is configured to display a plurality of membership functions, and an
indicator showing the relationship between the results and the membership functions, as
is shown in FIG. 5.

[0037] FIG. 5 is a depiction of a display 500 of an output showing a plurality
of membership functions and an indicator 502. In the display 500 of the output of the
profiling system 100, membership functions are displayed as triangles, which, for
example, may depict a particular shipment's degree of membership in a linguistic risk
categorization. In this embodiment, the display 500 can be called a risk assessment
display. The risk designations are compatible with fuzzy logic (a form of
approximate reasoning).

[0038] FIG. 5 shows how five risk levels may be characterized by
membership functions. For example, a record that scored as indicated by the small,
inverted arrow is a member of both the elevated risk level (indicated by the large
upright triangle centered under the number 0.5, and which could be colored yellow) and
the overlapping high risk level (indicated by the large, upright triangle centered under
the number 0.66, and which could be colored orange). Thus, the score indicated by the
small inverted arrow is well into the elevated level, while it also registers toward the
low end of the high level. The display 500 also includes a low risk level (indicated by
the horizontal line and the downward sloping line that intersect under the number 0.16,
and which could be colored green), and also includes a guarded risk level (indicated by
the large upright triangle under the number 0.33, and which could be colored blue), and
also includes a severe risk level (indicated by the upward sloping line and the horizontal
line that intersect under the number 0.87, and which could be colored red).

[0039] As an example, linguistic descriptions of risk levels may be defined in
terms of their respective lower edges as follows: 0.000 may mark the lower edge of the
low risk level; 0.333 may mark the lower edge of the guarded risk level; 0.500 may
mark the lower edge of the elevated risk level; 0.667 may mark the lower edge of the
high risk level; and 0.875 may mark the lower edge of the severe risk level. In the
illustrated example, the center of the severe category is shifted slightly to the right
(from 0.833 to 0.875). As shown in FIG. 5, the defining values for the high, elevated,
guarded, and low risk levels are centered at the vertexes of their respective triangles
(although the membership function for low risk could be a trapezoidal function,
symmetric to that of the severe risk category). A linguistic interpretation of the risk,
such as low, guarded, elevated, high, or severe, may also be shown in the displayed
output.

[0040] Linguistic descriptions of risk levels may also be defined in terms of
their respective points of intersection between membership functions, indicating
equivalent membership in a linguistic description above and below the associated
numeric level of assessed risk.
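
The five overlapping risk levels can be sketched in Python as triangular and shoulder-shaped membership functions. The breakpoints used here (0.167, 0.333, 0.500, 0.667, 0.875) are assumptions loosely based on the values quoted for FIG. 5, and the names triangle, left_shoulder, right_shoulder, risk_memberships, and dominant_level are hypothetical; the actual shapes in the figure may differ.

```python
from typing import Dict


def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Membership rises from 0 at left to 1 at peak, then falls to 0 at right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def left_shoulder(x: float, peak: float, right: float) -> float:
    """Membership is 1 up to peak, then falls to 0 at right."""
    if x <= peak:
        return 1.0
    if x >= right:
        return 0.0
    return (right - x) / (right - peak)


def right_shoulder(x: float, left: float, peak: float) -> float:
    """Membership is 0 up to left, rises to 1 at peak, and stays at 1."""
    if x >= peak:
        return 1.0
    if x <= left:
        return 0.0
    return (x - left) / (peak - left)


def risk_memberships(score: float) -> Dict[str, float]:
    """Degree of membership of a score in [0, 1] in each linguistic risk level."""
    return {
        "low": left_shoulder(score, 0.167, 0.333),
        "guarded": triangle(score, 0.167, 0.333, 0.500),
        "elevated": triangle(score, 0.333, 0.500, 0.667),
        "high": triangle(score, 0.500, 0.667, 0.875),
        "severe": right_shoulder(score, 0.667, 0.875),
    }


def dominant_level(score: float) -> str:
    """Linguistic interpretation: the level with the greatest membership."""
    memberships = risk_memberships(score)
    return max(memberships, key=memberships.get)
```

Under these assumed breakpoints, a score of 0.55 belongs mostly to the elevated level and partly to the high level, which mirrors the indicator example described for FIG. 5.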

[0041] Referring again to FIG. 4A, the sequence 400 may also include
operation 414, which comprises receiving feedback from the user 302 regarding the
outputted results. As an example, the user 302 may adjust the response of the profiling
system 100 by modifying the profiling rules in terms of the perceived risk or assessment.
For example, a user 302 may choose to err on the side of caution and indicate higher
risks in general operations. Another user may be less risk averse and choose to err on
the side of indicating a lower risk for certain items and/or individuals who might
otherwise have been profiled with some evident concern.

[0042] The sequence 400 may also include operation 416, which comprises
adjusting at least one of the behavioral operators based on the feedback received
regarding the outputted results. The user 302 may also directly adjust rules.
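
One way operation 416 might be realized is sketched below in Python: the risk increments of the rules that fired are nudged toward the risk level that the user indicates in feedback. The function adjust_rule_weights, the rate parameter, and the equal sharing of the correction among fired rules are assumptions for illustration only, not the method prescribed by the specification.

```python
from typing import Dict, List


def adjust_rule_weights(weights: Dict[str, float], fired: List[str],
                        reported_risk: float, user_risk: float,
                        rate: float = 0.5) -> Dict[str, float]:
    """Return a copy of weights with the fired rules moved toward the feedback.

    reported_risk is what the system output; user_risk is what the user says
    the risk should have been. Rules that did not fire are left unchanged.
    """
    updated = dict(weights)
    if not fired:
        return updated
    # Split a fraction of the error equally among the rules that fired.
    share = rate * (user_risk - reported_risk) / len(fired)
    for name in fired:
        updated[name] = min(1.0, max(0.0, updated[name] + share))
    return updated
```

A cautious user who repeatedly reports that outputs are too low would, under this sketch, gradually raise the weights of the rules involved, while a less risk-averse user would lower them.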

[0043] Referring to FIG. 4B, the sequence 400 may also include operation
418, which comprises adding at least one new behavioral operator based on the
feedback received regarding the outputted results.

[0044] The sequence 400 may also include operation 420, which comprises
analyzing the data. The data may be analyzed continuously, or repeatedly, or only
initially. The operation of analyzing the data may utilize artificial intelligence methods
that incorporate machine-learning techniques to adapt the operational rules, inference
structure, and/or anomaly detection performance of the profiling system 100 over time.
Such methods may include reinforcement learning, where feedback is given when truth
is determined regarding specified item(s) and/or individual(s), which permits correcting
incorrect profiles and reinforcing correct profiles. These methods may also include
evolutionary computing in which alternative hypothesized rules regarding the methods
of generating answers to profiling queries can be optimized over time through an
iterative process of variation and selection. The profiling system 100 may incorporate
other methods, such as neural networks, decision trees, finite state machines, and/or
other functions, which serve to enhance the appropriateness, accuracy, and precision of
the profiling system 100 in response to user 302 inquiries. The results of checking the
integrity of the data, the behavioral operators, and detecting anomalies can be used to
improve the machine learning procedures. Furthermore, the results from the machine
learning can improve the behavioral operators and anomaly detection performance. In
addition to the machine learning techniques, a user 302 may update the data sources
and profiling modules.
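
The iterative process of variation and selection mentioned above can be sketched in Python as a small evolutionary loop over candidate weight vectors. Everything specific here is an assumption for illustration: the linear scoring model, the mean squared error fitness, Gaussian mutation, the population size, and the names fitness and evolve do not come from the specification.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (feature values, known outcome)


def fitness(weights: Sequence[float], examples: Sequence[Example]) -> float:
    """Negative mean squared error of a weighted-sum score (higher is better)."""
    total = 0.0
    for features, outcome in examples:
        score = sum(w * f for w, f in zip(weights, features))
        total += (score - outcome) ** 2
    return -total / len(examples)


def evolve(examples: Sequence[Example], n_weights: int, population: int = 20,
           generations: int = 50, sigma: float = 0.1,
           seed: int = 0) -> List[float]:
    """Evolve a weight vector by repeated variation and selection."""
    rng = random.Random(seed)
    parents = [[rng.uniform(0.0, 1.0) for _ in range(n_weights)]
               for _ in range(population)]
    parents.sort(key=lambda w: fitness(w, examples), reverse=True)
    for _ in range(generations):
        # Variation: each parent produces one randomly mutated offspring.
        offspring = [[w + rng.gauss(0.0, sigma) for w in parent]
                     for parent in parents]
        # Selection: keep the best of parents and offspring together,
        # so the best candidate found so far is never lost.
        pool = parents + offspring
        pool.sort(key=lambda w: fitness(w, examples), reverse=True)
        parents = pool[:population]
    return parents[0]
```

Because parents compete with their offspring, the fitness of the best candidate cannot decrease from one generation to the next. The outcomes used as examples would correspond to cases where truth has later been determined.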

[0045] The profiling system 100 may use mathematical logic that is capable
of handling linguistic concepts. One such logic that can accomplish this is fuzzy logic,
which is a form of approximate reasoning. Fuzzy logic accommodates approximate
relationships, wherein data can be categorized linguistically (for example as high or
low, or heavy or light), rather than numerically. Other methods of approximate
reasoning could be incorporated in the profiling system 100. In this manner, the
profiling system 100 can handle inquiries even if data are omitted, either in small part,
in large part, or completely, and/or if the data are deemed to have less than one-hundred
percent reliability.
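As a non-limiting illustration of linguistic categorization, the following sketch assigns a shipment weight degrees of membership in the sets "light" and "heavy". The function names, the breakpoints of 1000 kg and 20000 kg, and the choice to scale memberships by a reliability factor are assumptions made for this example only and are not taken from the disclosure.

```python
def ramp(x, lo, hi):
    """Linear ramp: 0.0 at or below lo, 1.0 at or above hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def weight_memberships(weight_kg, reliability=1.0, lo=1000.0, hi=20000.0):
    """Return degrees of membership in the linguistic sets 'light' and 'heavy'.

    The breakpoints lo and hi and the reliability scaling are illustrative
    assumptions. A missing weight (None) contributes no evidence to either set.
    """
    if weight_kg is None:
        return {"light": 0.0, "heavy": 0.0}
    heavy = ramp(weight_kg, lo, hi)
    return {"light": (1.0 - heavy) * reliability, "heavy": heavy * reliability}
```

Under these assumptions, a weight midway between the breakpoints is half "light" and half "heavy", and lowering the reliability shrinks both memberships proportionally, so that an omitted or less reliable datum still yields an answer.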

[0046] Two principal approaches to machine learning are called supervised
learning and unsupervised learning. Machine learning may also include normal
statistical methods. With supervised learning, examples of results (outcomes) are
available, and mathematical models are generated to relate data inputs to results. The
mathematical models may include behavioral rules, and may also include mathematical
constructions other than behavioral rules, such as a neural network, which can be used
as a new part of the behavioral rule set. The mathematical models may be used to
produce rule-based scoring. As an example, a neural network may assign a value to a
particular set of inputs, for example, the arrangement of playing pieces on a checker
board. Models may also include human expert rules, such as if A then B, and if C then
D. In the case of neural networks, evolutionary computing can be used to assign
weights to inputs and nodes in the network, which may also be adjusted by gradient
methods, annealing, and other meta-heuristics. The mathematical models can then be
used to predict results, if new input data are inputted into the models. As an example,
several features regarding a patient and the patient's mammogram may be inputted to a
neural network to obtain an output regarding the level of risk that cancer is present. As
another example, feature data regarding movies that a person liked and did not like can
be used to generate a model to predict which movies the person will like and will not like.
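The movie example above can be sketched with a single-neuron model trained by the perceptron rule, which is only one of many possible supervised learning methods and is chosen here for brevity. The feature encoding (whether a movie is an action film, whether it is a romance), the sample ratings, and all names below are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 (will like) if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=10, rate=1.0):
    """Fit weights and a bias from (features, outcome) examples."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for features, liked in examples:
            # The error is zero when the current model already agrees
            # with the known outcome, so only mistakes change the model.
            error = liked - predict(weights, bias, features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


# Hypothetical examples: (is_action, is_romance) -> liked (1) or not liked (0).
RATED_MOVIES = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0), ((0, 0), 0)]
```

Calling `train_perceptron(RATED_MOVIES)` returns weights and a bias that can then be passed to `predict` with the features of a movie the person has not yet rated.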

[0047] The operation 420 of analyzing the data may comprise developing
mathematical models to explain outcomes, which is discussed above. The sequence
400 may also include operation 422, which comprises using the mathematical models
to generate at least one new behavioral operator (new behavioral rules or new
mathematical models). The behavioral operators do not have to be explicit behavioral
rules, and for example, could be the output of a neural network (which may provide a
correct output using logic that is not explicitly understood). The sequence 400 may
also include operation 424, which comprises including the at least one new behavioral
operator in the behavioral operators. The sequence 400 may also include operation
426, which comprises using the mathematical models to delete at least one behavioral
operator. The sequence 400 may also include operation 428, which comprises using
the mathematical models to modify at least one behavioral operator.

[0048] The operation 420 of analyzing the data may comprise detecting if
there are one or more anomalies in the data, instead of, or in addition to developing
mathematical models to explain outcomes. The operation of detecting if there are any
anomalies in the data, may also include identifying the anomalies. An anomaly is data
that does not fit into any cluster. In specific applications, anomalies may be defined,
for example, as machine defects, or as the presence of factors that may indicate terrorist
activity. For example, hazardous materials (HAZMAT), may present an anomaly in
shipping data that may indicate terrorist activity, and warrant further investigation. In
other instances it may not be known what types of anomalies may be present, and an
anomaly may be defined as anything unusual in the data.

[0049] Unsupervised learning will generally be used for detecting anomalies.
Unsupervised learning concerns looking for patterns in data when examples of
outcomes are not known. With unsupervised learning, examples of outcomes are not
available or are not used, the data may not be labeled, and the computer looks for
patterns in the data, and may group the data into clusters. Clusters are formed so as to
maximize similarity of data within each cluster, and to maximize differences between
different clusters. Anomalies are an indication of unusualness, and may be identified
by mining data to find data that does not belong in a cluster. Evolutionary computing,
as well as other methods such as k-means and annealing, may be used for forming the
clusters. Different models may be generated for the data, for example, linear, or
nonlinear models. It is possible to generate multiple reasonable sets of clusters for the
same data when conditioning on different aspects of the data's properties. For
example, a penny, nickel, dime, and quarter (U.S. coins) could be clustered by
assigning the penny and nickel to one cluster and the dime and quarter to another, with
the rule being that the clusters separate the data based on unit value. An alternative
would be to assign the penny and dime to one cluster and the nickel and quarter to
another, thereby partitioning based on the size of the coin. In general, there may be
more than two clusters, and determining an optimal number of clusters is a problem of
significant mathematical interest with a long history of work. Once the data are
clustered, additional analysis can indicate which data least belong to any cluster, and
are therefore anomalous.
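The final step, finding the data that least belong to any cluster, might be sketched as follows. For brevity the sketch forms two clusters over one-dimensional data with a small k-means loop rather than with evolutionary computing. The seeding at the minimum and maximum values, the fixed iteration count, and the function names are assumptions of this example.

```python
def two_means_1d(values, iterations=20):
    """Small 1-D k-means with k=2, seeded at the minimum and maximum values."""
    centroids = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            groups[nearest].append(v)
        # Keep the previous centroid if a cluster happens to be empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def rank_by_anomaly(values, centroids):
    """Order values so that those farthest from every centroid come first."""
    return sorted(values,
                  key=lambda v: min(abs(v - c) for c in centroids),
                  reverse=True)


# Hypothetical measurements: two tight groups and one value far from both.
SAMPLE = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9, 9.0]
```

With the sample data, the value 9.0 is ranked first by `rank_by_anomaly(SAMPLE, two_means_1d(SAMPLE))`, because it lies farthest from the centroid of the cluster to which it is assigned.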

[0050] Mathematical statistical inference, time series analysis methods,
pattern recognition, and/or evolutionary computation may be utilized to assess the
normalcy of an event or condition. For example, an event or condition may be
described in conditional statements, such as, "If there is a pattern of transporting
bananas by truck across an international border every Tuesday, and if it is Tuesday, and
if the current transported item is not bananas, then increase the perceived risk
associated with the transported item." Generally, anomaly detection operates by
creating statistical descriptions and/or models of what is routine behavior, and then
identifies behaviors that are not routine.
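The conditional statement above can be sketched as a simple statistical description of routine behavior followed by a check against it. The 80 percent dominance threshold, the risk increment of 0.1, and the function names are illustrative assumptions and are not part of the disclosure.

```python
from collections import Counter


def routine_item(history, weekday, min_share=0.8):
    """Return the item that dominates past shipments on this weekday, if any.

    history is a list of (weekday, item) pairs from earlier shipments.
    """
    items = [item for day, item in history if day == weekday]
    if not items:
        return None
    item, count = Counter(items).most_common(1)[0]
    return item if count / len(items) >= min_share else None


def adjust_risk(risk, history, weekday, current_item, increment=0.1):
    """Increase the perceived risk when the current item breaks the routine."""
    expected = routine_item(history, weekday)
    if expected is not None and current_item != expected:
        return risk + increment
    return risk
```

If nine of the last ten Tuesday shipments were bananas, a Tuesday shipment of anything else has its risk raised, while a shipment on a weekday with no established pattern is left unchanged.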

[0051] The sequence 400 may also include operation 430, which comprises
performing additional data integrity testing on a detected anomaly. The sequence 400
may also include operation 432, which comprises generating an alert concerning the
detected anomaly, to notify the user 302 of the anomaly.

[0052] Referring to FIG. 4C, the sequence 400 may also include operation 
434, which comprises altering at least one behavioral operator based on the detected 
anomaly. 

[0053] The degree to which a record is anomalous, or an outlier, can be
classified in linguistic categories such as "not anomalous," "low" degree of anomaly,
"medium" degree of anomaly, and "high" degree of anomaly. These categories may be
defined numerically, arbitrarily, for each cluster. Following the evolution or other
determination of clusters, each of the records may have its rule-based risk score
incremented according to its rank among the records. The amount a score is
incremented may be as follows:

- "not anomalous": no incremental change of score;


- "low" degree of anomaly: increment score by 1/60 (0.0167);
- "medium" degree of anomaly: increment score by 2/60 (0.0333); and
- "high" degree of anomaly: increment score by 3/60 (0.05).
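The increments listed above can be expressed as a lookup applied to a record's rule-based score. The increments themselves come from the list. The rank cutoffs used to choose a category (most anomalous 5 percent, 15 percent, and 30 percent of records) are hypothetical, since the text states only that the categories may be defined numerically and arbitrarily for each cluster.

```python
# Increments taken from the list above.
ANOMALY_INCREMENT = {
    "not anomalous": 0.0,
    "low": 1 / 60,
    "medium": 2 / 60,
    "high": 3 / 60,
}


def category_for_rank(rank, total):
    """Map a record's rank (0 = most anomalous) to a linguistic category.

    The cutoffs below are illustrative assumptions only.
    """
    fraction = rank / total
    if fraction < 0.05:
        return "high"
    if fraction < 0.15:
        return "medium"
    if fraction < 0.30:
        return "low"
    return "not anomalous"


def incremented_score(rule_based_score, category):
    """Add the category's increment to the rule-based risk score."""
    return rule_based_score + ANOMALY_INCREMENT[category]
```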

[0054] The sequence 400 may also include operation 436, which comprises
receiving feedback concerning system performance. The sequence 400 may also
include operation 438, which comprises adjusting at least one parameter based on the
feedback received concerning system performance. For example, in a shipping
screening implementation, the user 302 may input feedback information to the profiling
system 100 to inform the profiling system 100 that too many alerts are being generated,
or conversely, that not enough alerts are being generated. As another example, the
profiling system 100 may require more time than is available for a particular
application, and consequently, the parameters of the algorithm are adjusted to make the
algorithm converge faster. As another example, if an evolutionary algorithm is used for
clustering data, based on clustering performance, the user 302 may desire to revise the
number of clusters from, for example, two to three or more, or from four to three or
less. Other adjustments might be made to the parameters of the clustering algorithm,
including the population size, the number of parents, the number of offspring per
parent, the types of variation operators, the type of selection operator, and so forth. As
another example, in the shipping container screening embodiment, based on evidence
obtained by opening a container and examining the contents, the correctness of the
possible classification of the contents by the algorithm can be fed back to the method to
have the method adjust the parameters of its functions. If the process utilizes a neural
network, the error (if any) of the classification could be used as a basis for adjusting the
weights (parameters) of the neural network to compensate for the error. As another
example, the user's 302 goals could change, and the user 302 could provide feedback to
the profiling system 100 so that results will be generated consistent with the new goals.
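The first example, in which the user reports that too many or too few alerts are being generated, might be sketched as an adjustment to the score threshold above which an alert is raised. Treating the threshold as the adjusted parameter, the feedback strings, and the step size of 0.05 are all assumptions of this sketch.

```python
def adjust_alert_threshold(threshold, feedback, step=0.05):
    """Move the alert threshold in response to feedback on system performance.

    A higher threshold produces fewer alerts; a lower one produces more.
    The result is kept within the range 0.0 to 1.0.
    """
    if feedback == "too many alerts":
        threshold += step
    elif feedback == "not enough alerts":
        threshold -= step
    return min(1.0, max(0.0, threshold))
```

Feedback that matches neither phrase leaves the threshold unchanged, which keeps the adjustment conservative when the user's intent is unclear.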

[0055] The sequence 400 may also include operation 440, which comprises
proactively generating at least one suggestion. Suggestions may concern properties of
available data that are of potential interest to the user. Suggestions may be generated
continuously, or repeatedly, or only once. The sequence 400 may also include
operation 442, which comprises outputting at least one of the generated suggestions.
The sequence 400 may also include operation 444, which comprises soliciting feedback
concerning the at least one generated suggestion. The sequence 400 may also include
operation 446, which comprises receiving feedback (from the user 302) concerning at
least one of the at least one generated suggestions. The sequence 400 may also include
operation 448, which comprises interpreting the feedback (or lack of feedback)
received concerning at least one of the at least one generated suggestions. The
profiling system 100 may analyze the data continuously, or repeatedly, or only once, in
order to generate suggestions. As an example, the profiling system 100 may
proactively suggest to the user 302 that the user 302 may be interested in dating a
person in a dating service data base, who the profiling system 100 has determined may
be of interest to the user 302. The profiling system 100 outputs the suggestion so that it
can be presented to the user 302 (for example on a display 114), and also asks the user
302 whether the user 302 is interested in the suggested person. The profiling system
100 may adjust its behavior concerning future suggestions, based on the feedback (or
lack of feedback) received from the user 302 regarding the person suggested by the
profiling system 100. Thus, the profiling system 100 is intelligently interactive, and
learns to offer data relevant to the user 302 through the class of machine learning called
reinforcement learning.
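One simple, reward-driven way to adjust future suggestions from feedback is sketched below. It keeps a weight per feature of the suggested person and nudges those weights up or down according to the user's response. The feedback labels, the reward values (including the mild penalty assumed for a lack of feedback), and the learning rate are hypothetical.

```python
# Assumed rewards: None stands for a lack of feedback.
REWARDS = {"interested": 1.0, "not interested": -1.0, None: -0.25}


def update_preferences(weights, suggestion_features, feedback, rate=0.1):
    """Return new feature weights after feedback on one suggestion."""
    reward = REWARDS[feedback]
    updated = dict(weights)
    for feature in suggestion_features:
        updated[feature] = updated.get(feature, 0.0) + rate * reward
    return updated


def suggestion_score(weights, features):
    """Score a candidate suggestion by summing its learned feature weights."""
    return sum(weights.get(f, 0.0) for f in features)
```

Candidates with higher values of `suggestion_score` would be offered first, so that repeated positive feedback on a feature makes similar suggestions more likely.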

[0056] The sequence 400 need not end after operation 448. Generally, the
operations of the sequence 400 may be repeated as many times as desired, and as long
as desired. For example, the operation 404 of receiving data may be repeatedly
performed. Generally one, several, or all, of the operations may be repeated. Results
may be improved with each iteration of the sequence 400.

[0057] The decisions, factual instantiations, and other effects of the profiling
system 100 may be stored in a data warehouse, for use in subsequent profiling
operations, thereby allowing the profiling system 100 to build on its decision making
and prior performance. Further, additional databases could be added to the system in a
modular fashion for enhanced performance.

[0058] In summary, some examples of the invention relate to a method for
profiling multiple items and/or individuals, and outputting results, which for example,
could indicate a perceived level of risk, in an intuitive user-friendly manner. An
exemplary embodiment is a profiling system for categorizing risk, which includes a
primary process of rule-based risk scores, and a secondary procedure for anomaly
detection using evolutionary computation. The profiling system 100 supplements
human expertise with machine learning and data mining tools, such as evolutionary
computing. Many of the examples of the invention benefit from the synergy of
performing evolutionary computing for performing data modeling and continuous
anomaly detection, in combination with proactively generating suggestions and
soliciting feedback from a user for adjusting the system.

III. OTHER EMBODIMENTS
[0059] The preceding disclosure describes a number of illustrative
embodiments of the invention. It will be apparent to persons skilled in the art that
various changes and modifications can be made to the described embodiments without
departing from the scope of the invention as defined by the following claims. Also,
although elements of the invention may be described or claimed herein in the singular,
the plural is contemplated unless limitation to the singular is explicitly stated.

CLAIMS

What is claimed is: 



1. A signal bearing medium tangibly embodying a program of machine-readable
instructions executable by a digital processing apparatus to perform a method for
identifying at least one property of data, the method comprising the following
operations:
receiving data;
making assessments regarding the data;
applying at least one behavioral operator;
analyzing the data, wherein the operation of analyzing the data comprises
detecting if there are any anomalies in the data;
outputting results;
receiving feedback concerning system performance; and
adjusting at least one parameter based on the feedback received concerning
system performance, wherein the at least one parameter is a parameter of a machine
learning method.

2. The signal bearing medium of claim 1, wherein the operations further comprise
repeating the receiving data, making assessments, applying, analyzing, outputting,
receiving feedback, and adjusting operations.

3. The signal bearing medium of claim 1, wherein the machine learning method
involves a neural network, and wherein the at least one parameter is a weight.

4. The signal bearing medium of claim 1, wherein the machine learning method is
an evolutionary algorithm.

5. The signal bearing medium of claim 1, wherein the machine learning method is
an evolutionary clustering algorithm.

6. The signal bearing medium of claim 1, wherein the machine learning method is
reinforcement learning.

7. The signal bearing medium of claim 1, wherein the machine learning method is
hill-climbing.

8. The signal bearing medium of claim 1, wherein the machine learning method is
annealing.

9. The signal bearing medium of claim 1, wherein the machine learning method is
meta-heuristics.



10. The signal bearing medium of claim 1, wherein the operation of making
assessments regarding the data further comprises making assessments regarding
features.

11. The signal bearing medium of claim 1, wherein the operations further comprise
receiving user knowledge.

12. The signal bearing medium of claim 1, wherein the operation of analyzing the
data further comprises repeatedly analyzing the data.

13. The signal bearing medium of claim 1, wherein the operation of analyzing the
data further comprises developing at least one mathematical model to explain
outcomes.

14. The signal bearing medium of claim 13, wherein the operations further
comprise:
using the at least one mathematical model to generate at least one new rule; and
using the at least one new rule as one of the behavioral operators.

15. The signal bearing medium of claim 13, wherein the operations further
comprise using the at least one mathematical model to delete at least one behavioral
operator.

16. The signal bearing medium of claim 13, wherein the operations further
comprise using the at least one mathematical model to modify at least one behavioral
rule.

17. The signal bearing medium of claim 1, wherein the operations further comprise
performing data integrity testing on a detected anomaly.

18. The signal bearing medium of claim 1, wherein the operations further comprise
generating an alert concerning a detected anomaly.

19. The signal bearing medium of claim 1, wherein the operations further comprise
altering at least one operational rule based on a detected anomaly.

20. The signal bearing medium of claim 1, wherein the operations further comprise:
proactively generating at least one suggestion;
outputting the at least one generated suggestion; and
soliciting feedback concerning the at least one generated suggestion.

21. The signal bearing medium of claim 20, wherein the operations further
comprise:
receiving feedback concerning at least one of the at least one generated
suggestions; and
interpreting the feedback received concerning at least one of the at least one
generated suggestions.

22. The signal bearing medium of claim 20:
wherein the operation of proactively generating at least one suggestion
comprises repeatedly generating suggestions; and
wherein the operation of outputting the at least one suggestion comprises
outputting each of the generated suggestions.

23. The signal bearing medium of claim 1, wherein the operation of receiving data
comprises repeatedly receiving data.

24. The signal bearing medium of claim 1, wherein the data comprises commercial
data.

25. The signal bearing medium of claim 1, wherein the data comprises government
data.

26. The signal bearing medium of claim 1, wherein the operations further comprise:
receiving feedback regarding the outputted results; and
adding at least one new operational rule based on the feedback regarding the
outputted results.

27. The signal bearing medium of claim 1, wherein the operations further comprise:
receiving feedback regarding the outputted results; and
adjusting at least one operational operator based on the feedback received
regarding the outputted results.

28. The signal bearing medium of claim 1, wherein the operation of outputting
results comprises:
outputting rules and results; and
outputting information configured to display the rules and results according to
user preferences.

29. The signal bearing medium of claim 1, wherein the operation of outputting
results comprises outputting information configured to indicate membership in at least
one membership function in a plurality of membership functions.

30. The signal bearing medium of claim 1, wherein the operation of outputting
results comprises outputting information configured to display a plurality of
membership functions and an indicator showing a relationship between the results and
the membership functions.

31. The signal bearing medium of claim 30, wherein each membership function in
the plurality of membership functions is associated with a respective level of risk.

32. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the risk presented by a shipment.

33. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the risk presented by a shipping container.

34. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the risk that an individual is a terrorist.

35. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the risk associated with a credit transaction.

36. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the risk that cancer is present.

37. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the chances that a person will enjoy dating another person.

38. The signal bearing medium of claim 1, wherein the at least one property of the
data comprises the chances that a person will enjoy a particular movie.

39. A signal bearing medium tangibly embodying a program of machine-readable
instructions executable by a digital processing apparatus to perform a method for
identifying at least one property of data, the method comprising the following
operations:
receiving data;
making assessments regarding the data;
applying at least one behavioral operator;
outputting results;
receiving feedback regarding the outputted results;
adjusting at least one behavioral operator based on the feedback received
regarding the outputted results; and
analyzing the data, wherein the operation of analyzing the data comprises
generating at least one machine generated mathematical model to explain outcomes.

40. The signal bearing medium of claim 39, wherein the operations further
comprise:
proactively generating at least one suggestion;
outputting the at least one generated suggestion; and
soliciting feedback concerning the at least one generated suggestion.

41. The signal bearing medium of claim 40, wherein the operations further
comprise:
receiving feedback concerning at least one of the at least one generated
suggestions; and
interpreting the feedback received concerning at least one of the at least one
generated suggestions.

42. The signal bearing medium of claim 40, wherein the operations further
comprise:
receiving feedback concerning system performance; and
adjusting at least one parameter based on the feedback received concerning
system performance, wherein the at least one parameter is a parameter of a machine
learning method.

43. The signal bearing medium of claim 39, wherein the operation of analyzing the
data comprises detecting if there are any anomalies in the data.

44. A signal bearing medium tangibly embodying a program of machine-readable
instructions executable by a digital processing apparatus to perform a method for
identifying at least one property of data, the method comprising the following
operations:
receiving data;
making assessments regarding the data;
checking integrity of the data;
applying at least one behavioral operator;
using machine learning to detect if there are any anomalies in the data;
outputting results;
proactively generating at least one suggestion;
outputting the at least one generated suggestion; and
soliciting feedback concerning the at least one generated suggestion.

45. The signal bearing medium of claim 44, wherein the operations further
comprise repeating the receiving data, making assessments, applying, outputting,
receiving feedback, and adjusting operations.

46. The signal bearing medium of claim 44, wherein the operations further
comprise:
receiving feedback concerning at least one of the at least one generated
suggestions; and
interpreting the feedback received concerning at least one of the at least one
generated suggestions.

47. The signal bearing medium of claim 44, wherein the operations further
comprise:
receiving feedback concerning system performance; and
adjusting at least one parameter based on the feedback received concerning
system performance, wherein the at least one parameter is a parameter of a machine
learning method.

48. The signal bearing medium of claim 44, wherein the operation of using machine
learning to detect if there are any anomalies in the data comprises using evolutionary
learning.

49. The signal bearing medium of claim 44, wherein the operations further
comprise analyzing the data, and wherein the operation of analyzing the data comprises
generating at least one machine generated mathematical model to explain outcomes.

50. A signal bearing medium tangibly embodying a program of machine-readable
instructions executable by a digital processing apparatus to perform a method for
identifying at least one property of data, the method comprising the following
operations:
receiving data;
making assessments regarding features and the data;
receiving user knowledge;
applying at least one behavioral operator;
outputting results;
wherein the operation of outputting results comprises outputting information
configured to display a plurality of membership functions and an indicator showing a
relationship between the results and the membership functions;
receiving feedback regarding the outputted results;
adjusting at least one of the at least one behavioral operators based on the
feedback received regarding the outputted results;
adding at least one new behavioral operator based on the feedback received
regarding the outputted results;
analyzing the data;
wherein the operation of analyzing the data comprises developing at least one
mathematical model to explain outcomes;
using the at least one mathematical model to generate at least one new
behavioral operator;
including the at least one new behavioral operator in the behavioral operators;
using the at least one mathematical model to delete at least one behavioral
operator;
using the at least one mathematical model to modify at least one behavioral
operator;
wherein the operation of analyzing the data further comprises detecting if there
are any anomalies in the data;
performing additional data integrity testing on a detected anomaly;
generating an alert concerning the detected anomaly;
altering at least one behavioral operator based on the detected anomaly;
receiving feedback concerning system performance;
adjusting at least one parameter based on the feedback received concerning
system performance, wherein the at least one parameter is a parameter of a machine
learning method;
proactively generating at least one suggestion;
outputting the at least one generated suggestion;
soliciting feedback concerning the at least one generated suggestion;
receiving feedback concerning at least one of the at least one generated
suggestions; and
interpreting the feedback received concerning at least one of the at least one
generated suggestions.

51. A signal bearing medium tangibly embodying machine-readable code
executable by a digital processing apparatus for identifying at least one property of
data, the code comprising:
a data integrity module configured to examine integrity of the data;
a behavioral operator module configured to generate and evaluate behavioral
operators;
an anomaly detection module configured to detect anomalies in the data;
a machine learning module configured to analyze the data; and
an interface/controller module coupled to the data integrity module, the
behavioral operators module, the anomaly detection module, and the machine learning
module; wherein the interface/controller module is configured to receive the data.

52. The signal bearing medium of claim 51, wherein the interface/controller module
is further configured to:
proactively generate suggestions;
output the generated suggestions; and
solicit feedback concerning the generated suggestions.

53. The signal bearing medium of claim 52, wherein the interface/controller module
is further configured to interpret feedback concerning the generated suggestions.

54. The signal bearing medium of claim 51, wherein the interface/controller module
is further configured to:
receive feedback concerning system performance; and
adjust parameters based on the feedback received concerning system
performance.

55. The signal bearing medium of claim 51:
wherein the interface/controller module is further configured to output results
and to receive feedback regarding the outputted results; and
wherein the behavioral operators module is further configured to adjust the
behavioral operators based on the feedback received regarding the outputted results.

56. The signal bearing medium of claim 51:
wherein the interface/controller module is further configured to output results and
to receive feedback regarding outputted results; and
wherein the behavioral operators module is further configured to add new
behavioral operators based on the feedback received regarding the outputted results.

57. A computer data signal embodied in a carrier wave embodying a program of
machine-readable instructions executable by a digital processing apparatus to perform a
method for identifying at least one property of data, wherein the method comprises the
following operations:
receiving data;
making assessments regarding the data;
applying at least one behavioral operator;
detecting if there are any anomalies in the data;
outputting results;
receiving feedback concerning system performance; and
adjusting at least one parameter based on the feedback received concerning
system performance, wherein the at least one parameter is a parameter of a machine
learning method.

58. A profiling system, comprising: 
a storage; and 

a processor coupled to the storage, wherein the processor is programmed to 
perform the following operations: 
receiving data; 

making assessments regarding the data; 
applying at least one behavioral operator; 
outputting results; 

receiving feedback regarding the outputted results; 
adjusting at least one behavioral operator based on the feedback received 
regarding the outputted results; and 

analyzing the data, wherein the operation of analyzing the data comprises 
generating at least one machine generated mathematical model to explain outcomes. 

59. The profiling system of claim 58, wherein the operations further comprise:
proactively generating at least one suggestion;
outputting the at least one generated suggestion; and
soliciting feedback concerning the at least one generated suggestion.

60. A profiling system, comprising:
means for receiving data; 

means for making assessments regarding the data; 
means for applying at least one behavioral operator; 
means for outputting results; 

means for receiving feedback concerning system performance; 

means for adjusting at least one parameter based on the feedback received 
concerning system performance, wherein the at least one parameter is a parameter of a 
machine learning method; 

means for analyzing the data; 

means for proactively generating at least one suggestion; 

means for outputting the at least one generated suggestion; and

means for soliciting feedback concerning the at least one generated suggestion.

61. A method for identifying at least one property of data, the method comprising
the following operations: 

receiving data; 

making assessments regarding the data; 
applying at least one behavioral operator; 

analyzing the data, wherein the operation of analyzing the data comprises 
detecting if there are any anomalies in the data; 
outputting results; 

receiving feedback concerning system performance; and 

adjusting at least one parameter based on the feedback received concerning 

system performance, wherein the at least one parameter is a parameter of a machine 

learning method. 

62. A method for identifying at least one property of data, the method comprising 
the following operations: 

receiving data; 

making assessments regarding the data;
applying at least one behavioral operator;
outputting results; 

receiving feedback regarding the outputted results; 

adjusting at least one behavioral operator based on the feedback received 
regarding the outputted results; and 

analyzing the data, wherein the operation of analyzing the data comprises 
generating at least one machine generated mathematical model to explain outcomes. 

63. A method for identifying at least one property of data, the method comprising
the following operations:
receiving data;
making assessments regarding the data;
checking integrity of the data;
applying at least one behavioral operator;
generating at least one machine generated mathematical model to explain
outcomes;
outputting results;
proactively generating at least one suggestion;
outputting the at least one generated suggestion; and
soliciting feedback concerning the at least one generated suggestion.